Euclid's Gift: Enhancing Spatial Perception and Reasoning in Vision-Language Models via Geometric Surrogate Tasks

Lian, Shijie, Wu, Changti, Yang, Laurence Tianruo, Yuan, Hang, Yu, Bin, Zhang, Lei, Chen, Kai

arXiv.org Artificial Intelligence

Spatial intelligence spans a rich suite of abilities, including visualising and transforming shapes, mentally rotating objects, judging relational positions and containment, and estimating numerosity. However, it remains a critical unresolved challenge for Multimodal Large Language Models (MLLMs). To fill this gap, we propose treating Euclidean geometry problem-solving as a surrogate task. Specifically, we constructed a curated multimodal dataset, called Euclid30K, comprising approximately 30K plane and solid geometry problems. To enable the models to learn and apply Euclidean principles from these geometry problems, we fine-tuned seven model variants (spanning 3B to 72B parameters) from the Qwen2.5VL, Qwen3VL, and RoboBrain2.0 families using Group Relative Policy Optimization (GRPO), encouraging the models to identify shapes, count and relate entities, and perform multi-step deductive reasoning with Euclidean principles. Our experiments demonstrate that the resulting models achieve substantial zero-shot gains across four spatial reasoning benchmarks (Super-CLEVR, Omni3DBench, VSI-Bench, and MindCube) without any task-specific adaptations. Notably, after training on Euclid30K, mean VSI-Bench accuracy rose from 36.6% to 41.8% (+5.2 points), and mean MindCube accuracy rose from 31.4% to 38.1% (+6.7 points). To our knowledge, this is the first systematic study showing that geometry-centric fine-tuning can confer broadly transferable spatial skills on vision-language models. Code and the Euclid30K dataset are available at https://zgca-ai4edu.github.io/Euclids_Gift.
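The abstract names GRPO as the fine-tuning algorithm but does not include the authors' implementation. As a rough illustration, the minimal sketch below shows the core step that distinguishes GRPO: sampling a group of responses per prompt and normalizing each response's reward against the group's mean and standard deviation, which removes the need for a learned value function. The reward values and function name are hypothetical placeholders, not part of the Euclid30K codebase.

```python
import torch

def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Compute GRPO-style advantages for one group of sampled responses.

    rewards: shape (group_size,), one scalar reward per response sampled
             for the same prompt (e.g., 1.0 if the model's final answer
             matches the geometry problem's ground truth, else 0.0).
    Returns advantages of the same shape: each reward normalized by the
    group mean and standard deviation, so the policy gradient pushes
    probability toward above-average responses within the group.
    """
    mean = rewards.mean()
    std = rewards.std()
    return (rewards - mean) / (std + eps)

# Hypothetical usage: 8 responses sampled for one geometry problem,
# 3 of which reached the correct answer.
rewards = torch.tensor([1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0])
print(group_relative_advantages(rewards))
```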


Research information in the light of artificial intelligence: quality and data ecologies

Azeroual, Otmane, Koltay, Tibor

arXiv.org Artificial Intelligence

The amount of data, defined as a "reinterpretable representation of information in a formalized manner, suitable for communication, interpretation, or processing" [1], is constantly increasing across institutions. Particularly affected is the amount of research information (such as publication data, personal data, project data, third-party funding data, etc.) in universities and research institutions. This means that research results must not only be verified and interpreted; it must also be understood how these results came about and how they can be used. The preparation, utilization, and preservation of a wide variety of research information has always been an important core task for these institutions and their libraries, which can take over the organization of all information about the data stocks and their secure long-term archiving. The usefulness of research information depends heavily on the quality of the data recorded there. The topic of data quality (DQ) is therefore becoming increasingly important in both theory and practice. This is not surprising, since securing and improving data quality plays an ever larger role, especially given rapidly growing data stocks and the increasing use of research information management (RIM) systems. Data quality is defined as the properties of data in relation to their ability to meet specified requirements [2,3]. To ensure a high level of DQ, scientifically proven methods and procedures are required.
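As a concrete illustration of the cited definition, where data quality means the ability of data to meet specified requirements, the sketch below scores research-information records against two simple requirements: completeness of mandatory fields and validity of the publication year. This is a minimal sketch under assumed conventions; the field names and rules are illustrative and not a method prescribed by the paper.

```python
# A minimal sketch, assuming publication records stored as dicts.
# Field names and validity rules are assumptions for illustration.

REQUIRED_FIELDS = ["title", "authors", "year", "doi"]

def completeness(record: dict) -> float:
    """Fraction of required fields that are present and non-empty."""
    filled = sum(1 for f in REQUIRED_FIELDS if record.get(f))
    return filled / len(REQUIRED_FIELDS)

def year_is_valid(record: dict) -> bool:
    """Validity check: publication year must be a plausible integer."""
    try:
        return 1500 <= int(record.get("year", "")) <= 2100
    except (TypeError, ValueError):
        return False

def quality_report(records: list[dict]) -> dict:
    """Aggregate two simple quality dimensions over a data stock."""
    n = len(records)
    return {
        "mean_completeness": sum(completeness(r) for r in records) / n,
        "valid_year_ratio": sum(year_is_valid(r) for r in records) / n,
    }

records = [
    {"title": "Paper A", "authors": "Doe, J.", "year": 2021, "doi": "10.1000/x"},
    {"title": "Paper B", "authors": "", "year": "20XX", "doi": ""},
]
print(quality_report(records))
```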